AITopics | glottal source

Collaborating Authors

glottal source

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Modeling and Estimation of Vocal Tract and Glottal Source Parameters Using ARMAX-LF Model

Lia, Kai, Akagia, Masato, Lib, Yongwei, Unokia, Masashi

arXiv.org Artificial IntelligenceOct-6-2024

Modeling and estimation of the vocal tract and glottal source parameters of vowels from raw speech can be typically done by using the Auto-Regressive with eXogenous input (ARX) model and Liljencrants-Fant (LF) model with an iteration-based estimation approach. However, the all-pole autoregressive model in the modeling of vocal tract filters cannot provide the locations of anti-formants (zeros), which increases the estimation errors in certain classes of speech sounds, such as nasal, fricative, and stop consonants. In this paper, we propose the Auto-Regressive Moving Average eXogenous with LF (ARMAX-LF) model to extend the ARX-LF model to a wider variety of speech sounds, including vowels and nasalized consonants. The LF model represents the glottal source derivative as a parametrized time-domain model, and the ARMAX model represents the vocal tract as a pole-zero filter with an additional exogenous LF excitation as input. To estimate multiple parameters with fewer errors, we first utilize the powerful nonlinear fitting ability of deep neural networks (DNNs) to build a mapping from extracted glottal source derivatives or speech waveforms to corresponding LF parameters. Then, glottal source and vocal tract parameters can be estimated with fewer estimation errors and without any iterations as in the analysis-by-synthesis strategy. Experimental results with synthesized speech using the linear source-filter model, synthesized speech using the physical model, and real speech signals showed that the proposed ARMAX-LF model with a DNN-based estimation method can estimate the parameters of both vowels and nasalized sounds with fewer errors and estimation time.

armax-lf model, glottal source, speech, (9 more...)

arXiv.org Artificial Intelligence

2410.04704

Country:

Asia > Japan (0.04)
Asia > India (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

A speech corpus for chronic kidney disease

Mun, Jihyun, Kim, Sunhee, Kim, Myeong Ju, Ryu, Jiwon, Kim, Sejoong, Chung, Minhwa

arXiv.org Artificial IntelligenceNov-3-2022

In this study, we present a speech corpus of patients with chronic kidney disease (CKD) that will be used for research on pathological voice analysis, automatic illness identification, and severity prediction. This paper introduces the steps involved in creating this corpus, including the choice of speech-related parameters and speech lists as well as the recording technique. The speakers in this corpus, 289 CKD patients with varying degrees of severity who were categorized based on estimated glomerular filtration rate (eGFR), delivered sustained vowels, sentence, and paragraph stimuli. This study compared and analyzed the voice characteristics of CKD patients with those of the control group; the results revealed differences in voice quality, phoneme-level pronunciation, prosody, glottal source, and aerodynamic parameters.

artificial intelligence, machine learning, whitney 0, (16 more...)

arXiv.org Artificial Intelligence

2211.01705

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > South Korea > Seoul > Seoul (0.05)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback